NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Tags, Borders, and Catalogs: Social Re-Working of Genre on LibraryThing

https://doi.org/10.1145/3449103

Antoniak, Maria; Walsh, Melanie; Mimno, David (April 2021, Proceedings of the ACM on Human-Computer Interaction)

Through a computational reading of the online book reviewing community LibraryThing, we examine the dynamics of a collaborative tagging system and learn how its users refine and redefine literary genres. LibraryThing tags are overlapping and multi-dimensional, created in a shared space by thousands of users, including readers, bookstore owners, and librarians. A common understanding of genre is that it relates to the content of books, but this resource allows us to view genre as an intersection of user communities and reader values and interests. We explore different methods of computational genre measurement within the open space of user-created tags. We measure overlap between books, tags, and users, and we also measure the homogeneity of communities associated with genre tags and correlate this homogeneity with reviewing behavior.Finally, by analyzing the text of reviews, we identify the thematic signatures of genres on LibraryThing, revealing similarities and differences between them. These measurements are intended to elucidate the genre conceptions of the users, not, as in prior work, to normalize the tags or enforce a hierarchy. We find that LibraryThing users make sense of genre through a variety of values and expectations, many of which fall outside common definitions and understandings of genre.
more » « less
Full Text Available
Bad Seeds: Evaluating Lexical Methods for Bias Measurement

https://doi.org/10.18653/v1/2021.acl-long.148

Antoniak, Maria; Mimno, David (January 2021, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers))

A common factor in bias measurement methods is the use of hand-curated seed lexicons, but there remains little guidance for their selection. We gather seeds used in prior work, documenting their common sources and rationales, and in case studies of three English-language corpora, we enumerate the different types of social biases and linguistic features that, once encoded in the seeds, can affect subsequent bias measurements. Seeds developed in one context are often re-used in other contexts, but documentation and evaluation remain necessary precursors to relying on seeds for sensitive measurements.
more » « less
Full Text Available
Narrative Paths and Negotiation of Power in Birth Stories

https://doi.org/10.1145/3359190

Antoniak, Maria; Mimno, David; Levy, Karen (November 2019, Proceedings of the ACM on Human-Computer Interaction)

Full Text Available
Evaluating the Stability of Embedding-based Word Similarities

Antoniak, Maria; Mimno, David (January 2018, Transactions of the Association for Computational Linguistics)

Word embeddings are increasingly being used as a tool to study word associations in specific corpora. However, it is unclear whether such embeddings reflect enduring properties of language or if they are sensitive to inconsequential variations in the source documents. We find that nearest-neighbor distances are highly sensitive to small changes in the training corpus for a variety of algorithms. For all methods, including specific documents in the training set can result in substantial variations. We show that these effects are more prominent for smaller training corpora. We recommend that users never rely on single embedding models for distance calculations, but rather average over multiple bootstrap samples, especially for small corpora.
more » « less
Full Text Available

Search for: All records